* [Standup] Convert step 5 to python Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
* Small bugfixes Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
---------
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
* initial wva --------- Signed-off-by: vezio <tyler.rimaldi@ibm.com> Signed-off-by: Vezio <31221081+Vezio@users.noreply.github.com>
* Bump guidellm in container to devel build
* Update guidellm harness to use scenarios and fix convert script
* Add a few basic workload examples
* Bump GuideLLM
* GuideLLM now converts single dataset to a list automatically
* Install CPU version of torch to cut down on install size
* Quote annotations
* Also escape labels
Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>
* Allow import and conversion of all runs in GuideLLM results Signed-off-by: Nick Masluk <nick@randombytes.net>
* autopep8 Signed-off-by: Nick Masluk <nick@randombytes.net>
* Update help Signed-off-by: Nick Masluk <nick@randombytes.net>
---------
Signed-off-by: Nick Masluk <nick@randombytes.net>
* Add bounds to scenarios Signed-off-by: Nick Masluk <nick@randombytes.net>
* Rename column variables Signed-off-by: Nick Masluk <nick@randombytes.net>
* Add more bound handling functions, support bounds in plots Signed-off-by: Nick Masluk <nick@randombytes.net>
* Complete implementation Signed-off-by: Nick Masluk <nick@randombytes.net>
* Remove stale code Signed-off-by: Nick Masluk <nick@randombytes.net>
* Fix comment Signed-off-by: Nick Masluk <nick@randombytes.net>
* Address comments Signed-off-by: Nick Masluk <nick@randombytes.net>
* Fix imports Signed-off-by: Nick Masluk <nick@randombytes.net>
* Add constants.py link Signed-off-by: Nick Masluk <nick@randombytes.net>
* Fix missing concurrency handling in GuideLLM Signed-off-by: Nick Masluk <nick@randombytes.net>
* Return dict to label figures Signed-off-by: Nick Masluk <nick@randombytes.net>
* Update columns Signed-off-by: Nick Masluk <nick@randombytes.net>
* Fix constant name Signed-off-by: Nick Masluk <nick@randombytes.net>
* Add back hack Signed-off-by: Nick Masluk <nick@randombytes.net>
* Fix fstring Signed-off-by: Nick Masluk <nick@randombytes.net>
---------
Signed-off-by: Nick Masluk <nick@randombytes.net>
Signed-off-by: vezio <tyler.rimaldi@ibm.com>
* Add more GuideLLM parameters to DataFrame Signed-off-by: Nick Masluk <nick@randombytes.net>
* Address comments Signed-off-by: Nick Masluk <nick@randombytes.net>
---------
Signed-off-by: Nick Masluk <nick@randombytes.net>
* Enable bounds in i/o length in config UI Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Update Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Rm unused lines Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Address feedback Signed-off-by: Jing Chen <jing.chen2@ibm.com>
---------
Signed-off-by: Jing Chen <jing.chen2@ibm.com>
…ds (#492) Signed-off-by: Nick Masluk <nick@randombytes.net>
* Allow symbolic link recursion for Python versions that support it Signed-off-by: Nick Masluk <nick@randombytes.net>
* Avoid merge conflict with HACK removal PR Signed-off-by: Nick Masluk <nick@randombytes.net>
* Avoid multi-line f-string for Python 3.11 Signed-off-by: Nick Masluk <nick@randombytes.net>
---------
Signed-off-by: Nick Masluk <nick@randombytes.net>
* [Run] Removed `fmperf` as a supported harness
Since `fmperf` is no longer actively developed, support for it has been dropped. Additionally, removed the "service account" definition from the benchmark launcher pod.
---------
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
…uated (#499) Signed-off-by: Nick Masluk <nick@randombytes.net>
…pot (#500) Signed-off-by: Nick Masluk <nick@randombytes.net>
If the local `docker` or `podman` daemon is not running, do NOT select it as `LLMDBENCH_CONTROL_CCMD`.
Also added a new helper script, `setup/preprocess/set_nixl_environment.py`, to help set environment variables for `RoCE/GDR`.
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
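The selection rule described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the function names `daemon_is_running` and `select_container_cmd` are hypothetical, and it assumes that a successful `<cmd> info` call is an adequate signal that the daemon is reachable.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence


def daemon_is_running(cmd: str) -> bool:
    """Return True if `<cmd> info` exits with status 0 (assumed health check)."""
    try:
        result = subprocess.run(
            [cmd, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or the daemon never answered.
        return False
    return result.returncode == 0


def select_container_cmd(
    candidates: Sequence[str] = ("docker", "podman"),
    which: Callable[[str], Optional[str]] = shutil.which,
    is_running: Callable[[str], bool] = daemon_is_running,
) -> Optional[str]:
    """Pick the first installed candidate whose daemon responds, else None."""
    for cmd in candidates:
        if which(cmd) and is_running(cmd):
            return cmd
    return None
```

The `which` and `is_running` parameters are injectable only so the rule can be exercised without a container runtime installed; a caller would normally use the defaults and export the result as `LLMDBENCH_CONTROL_CCMD`.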
* Initializing KubeCon tutorial Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Fix link and spelling Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Update Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Fix link Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Fix link Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Show broken links Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Fix link again Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Kubecon link Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Rm kind ref Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Address feedback Signed-off-by: Jing Chen <jing.chen2@ibm.com>
---------
Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* [Standup] Bug fix for failed k8s context in setup/functions.py
* [Standup] Omit the error message for a failed k8s context, but replace it with the correct context
* kubecon2025 -> tutorials
* Add basic guidellm tutorial
* Fix broken link
[Refactor] Remove no-longer-used functions from `functions.sh`
[Standup] Check for context files pointing to the wrong cluster (in addition to simply non-functional "stale" files)
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
This function is still used by `run.sh` Signed-off-by: maugustosilva <marcio.a.silva@ibm.com>
* Fill in PD tutorial Signed-off-by: Jing Chen <jing.chen2@ibm.com>
* Rm command Signed-off-by: Jing Chen <jing.chen2@ibm.com>
---------
Signed-off-by: Jing Chen <jing.chen2@ibm.com>
Signed-off-by: Jing Chen <jing.chen2@ibm.com>
GKE precise prefix cache aware tutorial
This required dependency was accidentally removed when support for `fmperf` was discontinued.
Signed-off-by: maugustosilva <marcio.a.silva@ibm.com>
… (#514) Added logic to use a CLI-provided namespace if the namespace cannot be detected from kubeconfig. Signed-off-by: Pete Cheslock <pete.cheslock@redhat.com>
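A minimal sketch of the fallback order described above, assuming the kubeconfig-detected namespace takes precedence and the CLI-provided one is used only when detection yields nothing. The function name `resolve_namespace` and its arguments are hypothetical and do not come from the PR.

```python
from typing import Optional


def resolve_namespace(detected: Optional[str], cli_namespace: Optional[str]) -> str:
    """Prefer the namespace detected from kubeconfig; fall back to the CLI value."""
    if detected and detected.strip():
        return detected.strip()
    if cli_namespace and cli_namespace.strip():
        # Detection failed, so use what the user passed on the command line.
        return cli_namespace.strip()
    raise ValueError(
        "namespace could not be detected from kubeconfig and none was provided"
    )
```

Raising when both sources are empty is a design choice of this sketch; the actual change may instead fall back to a default namespace.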
… host details (#511) Signed-off-by: Nick Masluk <nick@randombytes.net>
* Update vLLM parameters
* Add warmup step to vLLM benchmark
---------
Signed-off-by: Nick Masluk <nick@randombytes.net>
…516) deployments
Also fixed a bug in step 10.
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
…ke optional (#518)
Signed-off-by: Diego-Castan <diego.castan@ibm.com>
Controlled by the environment variable `LLMDBENCH_VLLM_COMMON_EXTRA_INIT_CONTAINER_CONFIG` (and the associated `_MODELSERVICE_` equivalents), this allows the main llm-d-benchmark image to be used to perform "pre-processing" tasks.
Converted all guides from mounting `/preprocess` from a `ConfigMap` to using an `initContainer`.
Reset the version of all dependencies back to "auto".
Updated the `OWNERS` file.
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
Addresses #782 (CNCF and CI Audit, week of Mar 9 2026)
- Add CODE_OF_CONDUCT.md
- Add SECURITY.md
- Add PR_SIGNOFF.md
- Add .prowlabels.yaml
- Add .github/CODEOWNERS
Signed-off-by: Andrew Anderson <andy@clubanderson.com>
* feat: allow disabling of prometheus service monitor
* feat: allow gateway resource configuration
* fix: document gateway env vars
* fix: small sim example
---------
Signed-off-by: MICHAEL DESMOND <mdesmond@us.ibm.com>
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
* remove redundant code
* add metrics summary into benchmark report v0.2
* fix the bug where METRICS_DIR was not set explicitly
* remove redundant package in Dockerfile
* set epp verbosity to 4 when monitoring mode is on
* add epp log analysis
* remove headers in python scripts
…n (#859)
Add support for running experiments multiple times and aggregating results
with mean and standard deviation to account for benchmark variability.
Changes:
- existing_stack/run_only.sh: Add -R/--repeat flag (default: 1) and
LLMDBENCH_HARNESS_REPEAT env var. Each run gets a unique experiment ID
({uid}_{workload}_run{i}). After all runs, calls aggregation script.
- analysis/aggregate_runs.py: New script that reads Benchmark Report v0.2
files from repeated runs and produces aggregated_summary.json and
aggregated_summary.txt with mean, std, min, max for all metrics.
Backward compatible: --repeat 1 (default) produces identical behavior.
Closes #701
Signed-off-by: Jing Chen <jing.chen2@ibm.com>
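The repeat-and-aggregate flow described in this commit can be illustrated with a short sketch. It is not the code in `analysis/aggregate_runs.py`. It assumes each run has already been reduced to a flat mapping of metric name to number (the real Benchmark Report v0.2 schema is not shown here), that run indices start at 1, and that "std" means the sample standard deviation. The metric names in the usage example are made up.

```python
import json
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def experiment_ids(uid: str, workload: str, repeat: int = 1) -> List[str]:
    """Build one ID per run in the {uid}_{workload}_run{i} pattern (i from 1, assumed)."""
    return [f"{uid}_{workload}_run{i}" for i in range(1, repeat + 1)]


def aggregate_runs(runs: Iterable[Mapping[str, object]]) -> Dict[str, Dict[str, float]]:
    """Compute mean, std, min and max per numeric metric across repeated runs."""
    samples: Dict[str, List[float]] = defaultdict(list)
    for run in runs:
        for name, value in run.items():
            # Skip non-numeric fields; bool is excluded because it subclasses int.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                samples[name].append(float(value))

    summary: Dict[str, Dict[str, float]] = {}
    for name, values in samples.items():
        summary[name] = {
            "mean": statistics.fmean(values),
            # Sample standard deviation needs at least two points.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical metrics from two repeated runs.
    runs = [
        {"ttft_ms": 10.0, "throughput": 100, "model": "m"},
        {"ttft_ms": 14.0, "throughput": 110, "model": "m"},
    ]
    print(experiment_ids("abc", "chat", repeat=2))
    print(json.dumps(aggregate_runs(runs), indent=2))
```

With a single run the spread is reported as 0.0, which keeps the summary well defined for the default `--repeat 1` case.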
Remove refs to benchmark report in config explorer
Signed-off-by: Nick Masluk <nick@randombytes.net>
Signed-off-by: Nick Masluk <nick@randombytes.net>
Closes #120